Shallow Transfer Between Slavic Languages
Abstract
This paper describes the architecture of a machine translation system designed primarily for Slavic languages. The architecture is based on a shallow transfer module and a stochastic ranker. The shallow transfer module helps resolve the problems that arise even when translating between related languages; the stochastic ranker then chooses the best translation from the set produced by the shallow transfer module. The evaluation results support the claim that both modules, newly introduced into the system, improve translation quality.
Similar Resources
Rapid development of RBMT systems for related languages
The article describes a new way of constructing rule-based machine translation systems (RBMT). RBMT systems are currently among the best performing machine translation systems. Most of the "big named" machine translation systems (Systran, 2007; Promt, 2007) belong to this category, but these systems have a big drawback: construction of such systems demands a great amount of time and resources, ...
Control and Cybernetics: A Method of Hybrid MT for Related Languages
The paper introduces a hybrid approach to a very specific field in machine translation — the translation of closely related languages. It mentions previous experiments performed for closely related Scandinavian, Slavic, Turkic and Romanic languages and describes a novel method, a combination of a simple shallow parser of the source language (Czech) combined with a stochastic ranker of (parts of...
"Reading Polish with Czech Eyes" or "How Russian Can a Bulgarian Text Be?": Orthographic Differences as an Experimental Variable in Reading Comprehension
The human language processing mechanism shows a remarkable robustness to different kinds of imperfect linguistic signals. However, it is unclear how exactly a message encoded in one system is decoded by persons used to a different system. We are interested in gaining insights about human performance at retrieving information encoded in an unfamiliar encoding system. Our focus lies on reading in...
The MULTEXT-East Morphosyntactic Specifications for Slavic Languages
Word-level morphosyntactic descriptions, such as “Ncmsn” designating a common masculine singular noun in the nominative, have been developed for all Slavic languages, yet there have been few attempts to arrive at a proposal that would be harmonised across the languages. Standardisation adds to the interchange potential of the resources, making it easier to develop multilingual applications or t...
Genetic Heritage of the Balto-Slavic Speaking Populations: A Synthesis of Autosomal, Mitochondrial and Y-Chromosomal Data
The Slavic branch of the Balto-Slavic sub-family of Indo-European languages underwent rapid divergence as a result of the spatial expansion of its speakers from Central-East Europe in early medieval times. This expansion, mainly to East Europe and the northern Balkans, resulted in the incorporation of genetic components from numerous autochthonous populations into the Slavic gene pools. Here, we...